Abstract

Motivated by fluorescence lifetime measurements, this paper considers the
problem of nonparametric density estimation in the pile-up model. Adaptive 
nonparametric estimators are proposed for the pile-up model in its simple form 
as well as in the case of additional measurement errors. Furthermore, oracle 
type risk bounds for the mean integrated squared error (MISE) are provided. 
Finally, the estimation methods are assessed by a simulation study and the 
application to real fluorescence lifetime data. 
1 Introduction

This paper is concerned with nonparametric density estimation in a specific inverse 
problem. Observations are not directly available from the target distribution, but 
suffer from both measurement errors and the so-called pile-up effect. The pile-up
effect refers to some right-censoring, since an observation is defined as the minimum 
of a random number of i.i.d. variables from the target distribution. The pile-up 
distribution is thus the result of a nonlinear distortion of the target distribution. In 
our setting we also take into account measurement errors, that is the pile-up effect 
applies to the convolution of the target density and a known error distribution. The 
aim is to estimate the target density in spite of the pile-up effect and additive noise. 

The pile-up model is encountered in time-resolved fluorescence when lifetime
measurements are obtained by the technique called Time-Correlated Single-Photon
Counting (TCSPC) (O'Connor and Phillips, 1984). The fluorescence lifetime is the
duration that a molecule stays in the excited state before emitting a fluorescence
photon (Lakowicz, 1999; Valeur, 2002). The distribution of the fluorescence lifetimes
associated with a sample of molecules provides precious information on the underlying
molecular processes. Lifetimes are used in various applications, e.g. to determine
the speed of rotating molecules or to measure molecular distances. This means that
the knowledge of the lifetime distribution is required to obtain information on physical
and chemical processes.

In the TCSPC technique, a short laser pulse excites a random number of molecules, 
but for technical reasons, only the arrival time of the very first fluorescence photon 
striking the detector can be measured, while the arrival times of the other photons 
are unobservable. The arrival time of a photon is the sum of the fluorescence lifetime
and some noise, i.e. a random delay due to the measuring instrument, such as
the time of flight of the photon in the photomultiplier tube. Hence, TCSPC
observations can be described by a pile-up model with measurement errors. The goal 
is to recover the distribution of the lifetimes of all fluorescence photons from the 
piled-up observations. 

Until recently TCSPC was operated in a mode where the pile-up effect is negligible.
However, a shortcoming of this mode is that the acquisition time is very long. Recent
studies have made clear that from an information viewpoint it is a better strategy
to operate TCSPC in a mode with considerable pile-up effect (Rebafka et al., 2010,
2011). Consequently, an estimation procedure is required that takes the pile-up effect
into account. The concern of this paper is to provide such a nonparametric estimator
of the target density and furthermore to include measurement errors in the model in
order to deal with real fluorescence data. Therefore, we develop adequate deconvolution
strategies for the correction in the pile-up model and test those methods on
simulated data as well as on real fluorescence data.

It is noteworthy that the pile-up model is connected to survival analysis, since it
can be considered as a special case of the nonlinear transformation model (Tsodikov,
2003). Indeed, it is straightforward to extend the methods proposed in this paper
to the more general case. Moreover, the model can also be viewed as a biased data
problem with known bias (Brunel et al., 2005). As a consequence, the first part of
the study is rather classical. Nonetheless, the consideration of measurement errors in
the second part is new and fruitful. Indeed, we show that deconvolution methods can
be used to complete the study in the spirit of Comte et al. (2006). These techniques
are rarely used in either survival analysis or pile-up model studies. Numerical
results confirm the adequacy of these methods in practice.

In Section 2 a nonparametric estimation strategy for the pile-up model (without
measurement errors) is presented to recover the target density. More precisely, a
projection estimator is developed based on finite dimensional functional spaces and
a tool is proposed to automatically select the model dimension achieving the best
possible rate of convergence. In Section 3 additional measurement errors are taken
into consideration, leading to an estimator based on Fourier deconvolution methods.
The rates obtained in this framework depend on the smoothness of the error density
and on the choice of a cut-off parameter. Furthermore, a cut-off selection strategy is
proposed to achieve an adequate bias-variance trade-off. In Section 4 the performance
of the methods is assessed via simulations and by an application to a dataset of
fluorescence lifetime measurements. All proofs are relegated to Section 5.
2 Nonparametric Estimator for the Pile-up Model



This section introduces the pile-up model and presents the nonparametric estimation
approach in the easier setting of the pile-up model before extending it in Section 3 to
the pile-up model including additive noise.



2.1 The pile-up model 

Let $\{Y_k, k \ge 1\}$ be a sequence of independent positive random variables with
target probability density function (pdf) $f_Y$ and cumulative distribution function (cdf)
$F$. Moreover, let $N$ be a random variable taking its values in $\mathbb{N}^* = \{1, 2, \dots\}$
independently of this sequence. Then an observation of the pile-up model is distributed
as the random variable $Z$ taking values in $\mathbb{R}_+$ defined by $Z = \min\{Y_1, \dots, Y_N\}$.

In Rebafka et al. (2010) it is shown that the cdf $G$ of $Z$, referred to as the pile-up
distribution function, is given by

$$G(z) = 1 - M(1 - F(z)), \quad z \in \mathbb{R}_+, \tag{1}$$

where $M$ is the probability generating function associated with $N$ defined as
$M(u) = \mathbb{E}(u^N)$ for $u \in [0, 1]$. Moreover, if $F$ admits a density $f_Y$ with respect to the Lebesgue
measure on $\mathbb{R}_+$, then $G$ admits a density $g$. Denoting $\dot M(u) = \mathbb{E}(N u^{N-1})$ and
$\ddot M(u) = \mathbb{E}(N(N-1)u^{N-2})$ for all $u \in [0, 1]$, the pile-up density $g$ is given by

$$g(z) = f_Y(z)\,\dot M(1 - F(z)), \quad z \in \mathbb{R}_+. \tag{2}$$

Note that the generating function $M : [0, 1] \to [0, 1]$ is bijective for any distribution
of $N$ and we denote its inverse function by $M^{-1}$. If $\mathbb{E}[N^2] < \infty$ and $\mathbb{P}(N = 1) \neq 0$,
$\mathbb{P}(N = 2) \neq 0$, then the functions $\dot M$ and $\ddot M$ are bounded by some constants
$0 < a < b < +\infty$ satisfying

$$a \le \dot M(u) \le b \quad\text{and}\quad a \le \ddot M(u) \le b \quad\text{for all } u \in [0, 1]. \tag{3}$$

Remark 2.1 In the more general nonlinear transformation model the function
$M : [0, 1] \to [0, 1]$ in (1) is not necessarily a probability generating function, but any
function $M$ such that $G$ given by (1) is a cdf (Tsodikov, 2003). That is, $G$ is still the result
of a distortion of the target distribution $F$, but the interpretation as a minimum is no
longer valid. Those models are studied in survival analysis. The estimators proposed
in this paper for the pile-up model are also applicable to nonlinear transformation
models.

Main example. In the fluorescence application it is assumed that the number $N$ of
photons per excitation cycle follows a Poisson distribution with known parameter $\mu$.
Note that the events where no photon is detected, i.e. $N = 0$, are discarded. Hence,
we consider a Poisson distribution restricted to $\mathbb{N}^*$ with renormalized probability
masses given by $\mathbb{P}(N = k) = \mu^k / (k!\,(e^{\mu} - 1))$. As $\mu$ is supposed to be known, the
functions $M$ and $\dot M$ are known as well and given by $M(u) = (e^{\mu u} - 1)/(e^{\mu} - 1)$ and
$\dot M(u) = \mu e^{\mu u}/(e^{\mu} - 1)$.
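As an illustration of the model, the following minimal sketch simulates pile-up observations in the Poisson example and can be used to check relation (1) empirically. It is not taken from the paper: the function names, the use of NumPy and the way the zero-truncated Poisson variable is drawn (first arrival time of a unit-rate Poisson process on $[0, \mu]$, conditioned on at least one arrival) are assumptions made for this example.

```python
import numpy as np

def sample_truncated_poisson(mu, size, rng):
    """Draw N from a Poisson(mu) law conditioned on N >= 1.

    Assumed construction: T is the first arrival time of a unit-rate Poisson
    process on [0, mu] given at least one arrival; then N = 1 + Poisson(mu - T).
    """
    u = rng.random(size)
    t = -np.log1p(-u * (-np.expm1(-mu)))  # -log(1 - u * (1 - exp(-mu)))
    return 1 + rng.poisson(mu - t)

def pgf(u, mu):
    """Generating function M(u) = (exp(mu*u) - 1) / (exp(mu) - 1)."""
    return np.expm1(mu * u) / np.expm1(mu)

def sample_pileup(n, mu, sample_y, rng):
    """Return n observations Z = min(Y_1, ..., Y_N).

    sample_y(size, rng) is a user-supplied (hypothetical) sampler of the
    target lifetimes Y.
    """
    counts = sample_truncated_poisson(mu, n, rng)
    y = sample_y(counts.sum(), rng)
    # Start index of each group of N_i consecutive lifetimes.
    starts = np.concatenate(([0], np.cumsum(counts)[:-1]))
    return np.minimum.reduceat(y, starts)
```

For instance, with exponential lifetimes one may compare `np.mean(z <= 0.5)` to `1 - pgf(np.exp(-0.5), mu)`, which is the value of $G(0.5)$ predicted by (1).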
2.2 Estimator of the target density in the pile-up model

The goal is to estimate the target density $f_Y$ from i.i.d. observations $Z_1, \dots, Z_n$ of
the pile-up distribution $G$. We propose a nonparametric estimator by searching in a
collection of functions the one that best fits the data or, in other words, the orthogonal
projection of $f_Y$ onto the function space. If $S$ is an adequate subspace of $L^2$, the
orthogonal projection of $f_Y$ on $S$ in the $L^2$-sense is the minimizer of $\|f_Y - h\|^2$ for $h$
in $S$, or equivalently, the minimizer of $\|h\|^2 - 2\langle h, f_Y\rangle$.

As $\langle h, f_Y\rangle = \mathbb{E}[h(Y)]$, we need an approximation of moments $\mathbb{E}[h(Y)]$ based on
pile-up observations. We note that inverting relation (1) gives $1 - F(z) = M^{-1}(1 - G(z))$.
Plugging this relation into (2), we obtain

$$f_Y(z) = \frac{g(z)}{\dot M(M^{-1}(1 - G(z)))} = w \circ G(z)\, g(z) \quad\text{with}\quad w(u) = \frac{1}{\dot M(M^{-1}(1 - u))}.$$

This allows us to relate moments of the target distribution $F$ with moments of
the pile-up distribution $G$. More precisely, for any bounded function $h$ the following
equality holds

$$\mathbb{E}[h(Y)] = \mathbb{E}[h(Z)\, w \circ G(Z)]. \tag{4}$$

To construct an estimator of the moment $\mathbb{E}[h(Y)]$ based on pile-up observations,
relation (4) suggests replacing the distribution function $G$ by its empirical version
$G_n(z) = \sum_{i=1}^n \mathbf{1}\{Z_i \le z\}/n$. Then an estimator of $\mathbb{E}[h(Y)]$ is given by

$$\frac{1}{n}\sum_{i=1}^n h(Z_i)\, w \circ G_n(Z_i) = \frac{1}{n}\sum_{i=1}^n h(Z_{(i)})\, w(i/n), \tag{5}$$

as $w \circ G_n(Z_{(i)}) = w(i/n)$ and where $Z_{(i)}$ denotes the $i$-th order statistic associated
with $(Z_1, \dots, Z_n)$ satisfying $Z_{(1)} \le \dots \le Z_{(n)}$. In the literature such weighted sums
of order statistics are known as L-statistics.

The approximation of moments $\mathbb{E}[h(Y)]$ by an L-statistic is the key property used
in the nonparametric estimation strategy that is proposed in the following. In the
pile-up model the weights $w(i/n)$ can be viewed as "corrections" of the observations
$Z_i$ as they do not follow the target distribution $F$, but the pile-up distribution $G$.
The weights are bounded because inequality (3) ensures that there exist constants
$w_0, w_1$ such that

$$\forall u \in [0, 1], \quad 0 < w_0 \le w(u) \le w_1 < \infty. \tag{6}$$

The computation of the estimator in (5) requires the knowledge of the weight function
$w$, which is entirely determined by the distribution of $N$. Hence, in the example above
on the Poisson distribution $w$ writes

$$w(u) = \frac{1 - e^{-\mu}}{\mu\,(u(e^{-\mu} - 1) + 1)} \tag{7}$$

with corresponding constants $w_0 = (1 - e^{-\mu})/\mu$ and $w_1 = (e^{\mu} - 1)/\mu$.
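The moment approximation (5) with the Poisson weight (7) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, NumPy is assumed, and ties among the observations are ignored (each sorted observation simply receives the weight $w(i/n)$ of its rank).

```python
import numpy as np

def poisson_weight(u, mu):
    """Weight (7): w(u) = (1 - e^{-mu}) / (mu * (u * (e^{-mu} - 1) + 1))."""
    c = -np.expm1(-mu)  # 1 - exp(-mu)
    return c / (mu * (1.0 - c * np.asarray(u)))

def corrected_moment(h, z, mu):
    """L-statistic (5): (1/n) * sum_i h(Z_(i)) * w(i/n), estimating E[h(Y)]."""
    z_sorted = np.sort(z)
    n = z_sorted.size
    weights = poisson_weight(np.arange(1, n + 1) / n, mu)
    return np.mean(h(z_sorted) * weights)
```

For example, `corrected_moment(lambda x: x, z, mu)` approximates the mean lifetime $\mathbb{E}[Y]$ from piled-up observations `z`, whereas the raw average `z.mean()` is biased downwards because each observation is a minimum.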
A standard estimation approach of the target density $f_Y$ consists in approximating
the orthogonal projection of $f_Y$ onto some function space. More precisely,
we suppose that the restriction of $f_Y$ to some interval $A$ is square integrable, i.e.
$f_Y\mathbf{1}_A \in L^2(A)$. For a given orthonormal sequence $(\varphi_\lambda)_{\lambda \in \Lambda_m}$ in $L^2(A)$ define the
subspace $S_m = \mathrm{Span}(\varphi_\lambda, \lambda \in \Lambda_m)$. The cardinality of $\Lambda_m$ (which is also the dimension of
$S_m$) is denoted by $D_m$ and supposed to be finite.

By using the moment estimator proposed in (5), an approximation of the projection
of $f_Y$ onto $S_m$ can be defined as

$$\hat f_m = \arg\min_{h \in S_m} \gamma_n(h) \quad\text{with}\quad \gamma_n(h) = \|h\|^2 - \frac{2}{n}\sum_{i=1}^n h(Z_{(i)})\, w(i/n),$$

since $\gamma_n(h)$ is an estimator of $\|h\|^2 - 2\mathbb{E}[h(Y)]$. Note that the explicit formula of the
estimate is given by

$$\hat f_m = \sum_{\lambda \in \Lambda_m} \hat a_\lambda \varphi_\lambda \quad\text{with}\quad \hat a_\lambda = \frac{1}{n}\sum_{i=1}^n \varphi_\lambda(Z_{(i)})\, w(i/n). \tag{8}$$

For this estimator the following risk bound is shown in Section 5.

Proposition 2.1 Let $f_m$ be the orthogonal projection in the $L^2$-sense of $f_Y$ on $S_m$.
Assume that (6) holds and that $w$ is Lipschitz continuous, i.e.

$$\text{there exists } c_w > 0 \text{ such that } |w(x) - w(y)| \le c_w |x - y|. \tag{9}$$

Assume moreover that

$$\text{there exists } \Phi_0 > 0 \text{ such that } \Big\|\sum_{\lambda \in \Lambda_m} \varphi_\lambda^2\Big\|_\infty \le \Phi_0^2 D_m, \tag{10}$$

then

$$\mathbb{E}(\|\hat f_m - f_Y\mathbf{1}_A\|^2) \le \|f_Y\mathbf{1}_A - f_m\|^2 + C\frac{D_m}{n}, \tag{11}$$

where $C$ depends on $\Phi_0$, $w_1$ and the Lipschitz constant $c_w$ of $w$.



Remark 2.2 It follows from equation (3) that the Lipschitz constant $c_w$ verifies
$c_w \le b/a^3$ since $w'(u) = \ddot M \circ M^{-1}(1 - u)/[\dot M \circ M^{-1}(1 - u)]^3$. In the Poisson example
where $w$ is given by (7) we have $c_w = (e^{\mu} - 1)^2/\mu$.

2.3 Examples of model collections 

Our goal is the estimation of $f_Y$ in a nonparametric setting without knowledge of
the best approximation space. Instead of a single space $S_m$, we rather consider a
collection $\{S_m, m \in \mathcal{M}_n\}$ of models and we thus have to face the problem of model
selection. Before presenting an estimator of the model $m$, we give some illustrative
examples of model collections $S_m$ and we discuss some general conditions for the
approximation spaces under which our estimation approach performs well.

In the following $A$ is supposed to be a compact set. For simplicity, we set $A = [0, 1]$.

[T] Trigonometric spaces: $S_m$ is generated by the functions
$\{1, \sqrt{2}\cos(2\pi jx), \sqrt{2}\sin(2\pi jx) \text{ for } j = 1, \dots, m\}$.
The dimension of $S_m$ is $D_m = 2m + 1$ and we may take $m \in \mathcal{M}_n = \{1, \dots, [n/2] - 1\}$.

[DP] Dyadic piecewise polynomial spaces of degree $r$ on the partition of $[0, 1]$ given
by the subintervals $I_j = [(j - 1)/2^p, j/2^p]$ for $j = 1, \dots, 2^p$, see Birgé and Massart
(1997), Section 4.2.2.

[W] Dyadic wavelet generated spaces with regularity $r$ and compact support, see e.g.
Daubechies (1992); Donoho et al. (1996).
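For the trigonometric collection [T], the coefficients of formula (8) can be computed as in the following sketch. It is given under assumptions that are not part of the paper: NumPy is used, the function names are hypothetical, `weight` is any vectorized implementation of $w$, and observations outside $A = [0, 1]$ contribute zero because the basis functions are supported on $A$ (their ranks in the full sample are nevertheless used for the weights).

```python
import numpy as np

def trig_basis(x, m):
    """Rows: 1, sqrt(2)cos(2*pi*j*x), sqrt(2)sin(2*pi*j*x) for j = 1..m."""
    x = np.asarray(x, dtype=float)
    rows = [np.ones_like(x)]
    for j in range(1, m + 1):
        rows.append(np.sqrt(2.0) * np.cos(2 * np.pi * j * x))
        rows.append(np.sqrt(2.0) * np.sin(2 * np.pi * j * x))
    return np.vstack(rows)  # shape (2m + 1, len(x))

def projection_coefficients(z, weight, m):
    """Coefficients (8): a_lambda = (1/n) * sum_i phi_lambda(Z_(i)) * w(i/n)."""
    z_sorted = np.sort(z)
    n = z_sorted.size
    w = weight(np.arange(1, n + 1) / n)
    inside = (z_sorted >= 0) & (z_sorted <= 1)  # basis vanishes outside A
    return trig_basis(z_sorted[inside], m) @ w[inside] / n

def evaluate(coeffs, x):
    """Evaluate the projection estimator sum_lambda a_lambda phi_lambda at x."""
    m = (coeffs.size - 1) // 2
    return coeffs @ trig_basis(x, m)
```

The coefficient vector is ordered as constant, cosine, sine, cosine, sine, and so on, so that the first $2m + 1$ entries always correspond to the model $S_m$.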

We now give the key properties that a general model collection $\{S_m, m \in \mathcal{M}_n\}$
must fulfill to fit into our framework.

(H1) Norm connection: $\{S_m, m \in \mathcal{M}_n\}$ is a collection of finite dimensional linear
subspaces of $L^2([0, 1])$ with dimension $\dim(S_m) = D_m$ satisfying $D_m \le N_n \le n$,
$\forall m \in \mathcal{M}_n$ and

$$\text{there exists } \Phi_0 > 0 \text{ such that } \|t\|_\infty \le \Phi_0 D_m^{1/2}\|t\| \text{ for all } m \in \mathcal{M}_n,\ t \in S_m. \tag{12}$$

Let $(\varphi_\lambda)_{\lambda \in \Lambda_m}$ be an orthonormal basis of $S_m$, where $|\Lambda_m| = D_m$. It follows from Birgé
and Massart (1997) that Property (12) in the context of (H1) is equivalent to (10)
for all $m \in \mathcal{M}_n$. This condition is easily checked for collection [T] with $\Phi_0 = 1$. For
collection [DP] see a detailed description in Birgé and Massart (1997), Section 2.2,
showing that condition (10) holds with $\Phi_0 = r + 1$. It is known that (10) is also
satisfied for wavelet bases [W].

Additionally, for results concerning adaptive estimators the following assumption 
is required. 

(H2) Nesting condition: $\{S_m, m \in \mathcal{M}_n\}$ is a collection of models such that there
exists a space $\mathcal{S}_n$ belonging to the collection such that $S_m \subset \mathcal{S}_n$ for all $m \in \mathcal{M}_n$.
Denote by $N_n$ the dimension of $\mathcal{S}_n$, i.e. $\dim(\mathcal{S}_n) = N_n \le n$.

This condition ensures that $D_m \le N_n$ for all $m \in \mathcal{M}_n$.



Another key property of those spaces lies in the bias evaluation. Indeed, if we
assume that $f_Y\mathbf{1}_A = f_A$ belongs to a ball of some Besov space $B_{\alpha,2,\infty}(A)$ with $r + 1 \ge \alpha$,
then for $\|f_A\|_{\alpha,2,\infty} \le L$ we have $\|f_A - f_m\|^2 \le C(\alpha, L) D_m^{-2\alpha}$ (Barron et al., 1999,
Lemma 12). Thus, choosing $D_{m^*} = O(n^{1/(2\alpha+1)})$ in Inequality (11) yields that the
mean square risk satisfies $\mathbb{E}(\|\hat f_{m^*} - f_A\|^2) \le O(n^{-2\alpha/(2\alpha+1)})$. This rate is known
to be optimal in the minimax sense for density estimation for direct observations
(Donoho et al., 1996).
2.4 Adaptive estimator



From the risk bound (11) it is clear that a bias-variance trade-off must be achieved.
The idea consists in searching the model $m$ that minimizes the risk bound (11). As
$\|f_Y - f_m\|^2 = \|f_Y\|^2 - \|f_m\|^2$, this is equivalent to minimizing $-\|f_m\|^2 + C D_m/n$, where
the term $-\|f_m\|^2$ can be estimated by $-\|\hat f_m\|^2 = \gamma_n(\hat f_m)$. Consequently, we propose
the following model selection device

$$\hat m = \arg\min_{m \in \mathcal{M}_n} \left[\gamma_n(\hat f_m) + \mathrm{pen}(m)\right], \tag{13}$$

where the penalty term $\mathrm{pen}(m)$ is of the same order as the variance, i.e. $C D_m/n$.
Using this approach the following result can be shown.

Theorem 2.1 Consider collections [DP] or [W] with $N_n \le O(n)$ or collection [T]
with $N_n \le O(\sqrt{n})$ and assume that $f_Y$ is bounded on $A$, i.e. $\|f_Y\|_\infty < \infty$. Let $\hat m$ be
defined by (13) with

$$\mathrm{pen}(m) = \kappa \left(\int_0^1 w^2(u)\,du\right) \frac{D_m}{n}. \tag{14}$$

Then there exists a numerical constant $\kappa$ such that we have

$$\mathbb{E}(\|f_Y\mathbf{1}_A - \hat f_{\hat m}\|^2) \le C \inf_{m \in \mathcal{M}_n}\left(\|f_Y\mathbf{1}_A - f_m\|^2 + \int_0^1 w^2(u)\,du\,\frac{D_m}{n}\right) + K\frac{\ln^2(n)}{n}, \tag{15}$$

where $C$ is a numerical constant and $K$ depends on $c_w$, $\|f_Y\|_\infty$ and the basis.

Risk bounds of the form (15) are often called oracle inequalities. Note that the last
term $K\ln^2(n)/n$ is clearly negligible with respect to the order of the infimum (in
particular, in all Besov cases described above).

In practice, the numerical constant $\kappa$ is calibrated by simulation experiments based
on a few samples. The selection of $\hat m$ in (13) is numerically easy, since the values of
$\gamma_n(\hat f_m)$ are given by $\gamma_n(\hat f_m) = -\sum_{\lambda \in \Lambda_m} \hat a_\lambda^2$ with $\hat a_\lambda$ defined in (8).

The proof of the theorem relies on Talagrand's inequality and follows the line of
the proof of Theorem 4.2 in Brunel and Comte (2005). Therefore, only a sketch of
the proof is provided in Section 5.



3 Pile-up Model with Measurement Errors 

In this section we consider the context where the random variables $Y_i$ are affected by
additional measurement errors. More precisely, the observations have the following
form $Z = \min\{Y_1 + \eta_1, \dots, Y_N + \eta_N\}$, where the measurement errors $\eta_i$ are independent
of $Y_i$ and have known density $f_\eta$ with support in $\mathbb{R}_+$. The pdf $f_X$ of $X = Y + \eta$ is the
convolution of $f_Y$ and $f_\eta$ denoted by $f_X = f_Y * f_\eta$. We denote by $u^*(t) = \int e^{-itx} u(x)\,dx$
the Fourier transform of an integrable function $u$.
3.1 Estimation procedure and risk bound



In the context of piled-up observations with measurement errors, since obviously
$f_Y^* = f_X^*/f_\eta^*$, one may consider the natural plug-in estimator of $f_Y$ given by
$\hat f_{Y,m}(x) = (2\pi)^{-1}\int_{-\pi m}^{\pi m} e^{ixu}\,\hat f_{\hat m}^*(u)/f_\eta^*(u)\,du$, provided that the Fourier transform of $\hat f_{\hat m}$ exists.
However, this approach leads to an accumulation of the estimation errors of the two
stages. It is known that the application of the inverse Fourier transform is
particularly unstable. Hence a better solution may be obtained by a direct approach.

To this end we note that in this set-up the "pile-up property" given by (4) holds
for $X = Y + \eta$, that is $\mathbb{E}(h(X)) = \mathbb{E}(h(Z)\, w \circ G(Z))$. Hence, a direct estimator of the
Fourier transform $f_X^*$ is given by

$$\widehat{f_X^*}(u) = \frac{1}{n}\sum_{k=1}^n e^{-iuZ_{(k)}}\, w(k/n), \tag{16}$$

and finally an estimator of the target density $f_Y$ can be defined as

$$\hat f_m(x) = \frac{1}{2\pi}\int_{-\pi m}^{\pi m} e^{ixu}\,\frac{\widehat{f_X^*}(u)}{f_\eta^*(u)}\,du. \tag{17}$$

For this estimator, the following risk bound can be shown. 

Proposition 3.1 Assume that $w$ satisfies (6) and (9). Let $f_{Y,m}$ denote the function
verifying $f_{Y,m}^* = f_Y^*\mathbf{1}_{[-\pi m, \pi m]}$. Then

$$\mathbb{E}(\|\hat f_m - f_Y\|^2) \le \|f_Y - f_{Y,m}\|^2 + C\frac{\Delta_\eta(m)}{n} \quad\text{where}\quad \Delta_\eta(m) = \frac{1}{2\pi}\int_{-\pi m}^{\pi m}\frac{du}{|f_\eta^*(u)|^2}, \tag{18}$$

and $C$ depends on $\int_0^1 w^2(u)\,du$ and on the Lipschitz constant $c_w$ of $w$.

Note that $\|f_Y - f_{Y,m}\|^2 = (2\pi)^{-1}\int_{|u| \ge \pi m} |f_Y^*(u)|^2\,du$.

Obviously, the variance depends crucially on the rate of decrease to 0 of $f_\eta^*$ near
infinity. For instance, if $f_\eta$ is the standard normal density, the variance is proportional
to $\int_{|u| \le \pi m} e^{u^2}\,du/n$, whereas for the Laplace distribution (i.e. $f_\eta(x) = e^{-|x|}/2$) we
have $1/f_\eta^*(u) = 1 + u^2$ and a variance of order $O(m^5/n)$.
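The estimator defined by (16) and (17) can be evaluated by direct numerical integration, as in the following sketch. The trapezoid rule, the grid size and all function names are assumptions made for this illustration; `weight` and `noise_char_fn` stand for vectorized implementations of $w$ and $f_\eta^*$ supplied by the user.

```python
import numpy as np

def fx_hat_star(u, z, weight):
    """Estimator (16): (1/n) * sum_k exp(-i*u*Z_(k)) * w(k/n), for an array u."""
    z_sorted = np.sort(z)
    n = z_sorted.size
    w = weight(np.arange(1, n + 1) / n)
    return np.exp(-1j * np.outer(u, z_sorted)) @ w / n

def deconvolution_estimate(x, z, weight, noise_char_fn, m, grid_size=513):
    """Estimator (17), integrated over [-pi*m, pi*m] with the trapezoid rule."""
    u = np.linspace(-np.pi * m, np.pi * m, grid_size)
    ratio = fx_hat_star(u, z, weight) / noise_char_fn(u)
    integrand = np.exp(1j * np.outer(np.atleast_1d(x), u)) * ratio
    du = u[1] - u[0]
    integral = du * (integrand.sum(axis=1)
                     - 0.5 * (integrand[:, 0] + integrand[:, -1]))
    # The exact integral is real for real-valued densities; keep the real part.
    return integral.real / (2 * np.pi)
```

With a single observation, no pile-up correction ($w \equiv 1$) and no noise ($f_\eta^* \equiv 1$), the result reduces to the sinc kernel $m\,\mathrm{sinc}(m(x - Z_1))$, which gives a simple sanity check of the implementation.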



3.2 Other ways to view the estimator 

The estimator $\hat f_m$ can also be derived in a different way. Recall that in Subsection 2.2
we defined an estimator by minimizing the contrast $\gamma_n(h)$ which is an approximation
of $\|h\|^2 - 2\mathbb{E}[h(Y)]$. Writing $\mathbb{E}[h(Y)] = \langle h, f_Y\rangle = (2\pi)^{-1}\langle h^*, f_Y^*\rangle = (2\pi)^{-1}\langle h^*, f_X^*/f_\eta^*\rangle$
suggests considering functions $h$ in $S_m = \{h, \mathrm{support}(h^*) \subset [-\pi m, \pi m]\}$ and the
new contrast

$$\gamma_n^\dagger(h) = \|h\|^2 - \frac{1}{\pi}\int h^*(-u)\,\frac{\widehat{f_X^*}(u)}{f_\eta^*(u)}\,du,$$

where $\widehat{f_X^*}$ is given by (16). Now we can see that the estimator $\hat f_m$ minimizes the
contrast $\gamma_n^\dagger$. Indeed, note that $\hat f_m^*(u) = (\widehat{f_X^*}(u)/f_\eta^*(u))\,\mathbf{1}_{[-\pi m, \pi m]}(u)$ and thus $\hat f_m \in S_m$.
By Parseval's formula $\langle h, \hat f_m\rangle = (2\pi)^{-1}\langle h^*, \hat f_m^*\rangle$. This yields that $\gamma_n^\dagger(h) = \|h\|^2 -
2\langle h, \hat f_m\rangle = \|h - \hat f_m\|^2 - \|\hat f_m\|^2$. Therefore, $\hat f_m = \arg\min_{h \in S_m} \gamma_n^\dagger(h)$.

Another expression of the estimator is obtained by describing more precisely the
functional spaces $S_m$ on which the minimization is performed. To that aim, let us
define the sinc function and its translated-dilated versions by

$$\varphi(x) = \frac{\sin(\pi x)}{\pi x} \quad\text{and}\quad \varphi_{m,j}(x) = \sqrt{m}\,\varphi(mx - j), \tag{19}$$

where $m$ is an integer that can be taken equal to $2^\ell$. It is well known that $\{\varphi_{m,j}\}_{j \in \mathbb{Z}}$ is
an orthonormal basis of the space of square integrable functions having Fourier
transforms with compact support in $[-\pi m, \pi m]$ (Meyer, 1990, p. 22). Indeed, as
$\varphi^*(u) = \mathbf{1}_{[-\pi, \pi]}(u)$, an elementary computation yields that
$\varphi_{m,j}^*(x) = m^{-1/2} e^{-ixj/m}\,\mathbf{1}_{[-\pi m, \pi m]}(x)$.

Thus, the functions $\varphi_{m,j}$ are such that $S_m = \mathrm{Span}\{\varphi_{m,j}, j \in \mathbb{Z}\} = \{h \in L^2(\mathbb{R}), \mathrm{supp}(h^*) \subset
[-m\pi, m\pi]\}$. For any function $h \in L^2(\mathbb{R})$, let $\Pi_m(h)$ denote the orthogonal projection
of $h$ on $S_m$ given by $\Pi_m(h) = \sum_{j \in \mathbb{Z}} a_{m,j}(h)\varphi_{m,j}$ with $a_{m,j}(h) = \int_{\mathbb{R}} \varphi_{m,j}(x)h(x)\,dx$. As
$a_{m,j}(h) = (2\pi)^{-1}\langle \varphi_{m,j}^*, h^*\rangle$, it follows that $\Pi_m(h)^* = h^*\mathbf{1}_{[-\pi m, \pi m]}$, and thus $f_{Y,m} =
\Pi_m(f_Y)$. Since $\hat f_m$ minimizes $\gamma_n^\dagger$, this yields that the estimator $\hat f_m$ can be written in
the following convenient way

$$\hat f_m = \sum_{j \in \mathbb{Z}} \hat a_{m,j}\varphi_{m,j} \quad\text{with}\quad \hat a_{m,j} = \frac{1}{2\pi}\int \varphi_{m,j}^*(-u)\,\frac{\widehat{f_X^*}(u)}{f_\eta^*(u)}\,du. \tag{20}$$

Consequently $\|\hat f_m\|^2 = \sum_{j \in \mathbb{Z}} |\hat a_{m,j}|^2$.

Finally, one can see that $\sum_{j \in \mathbb{Z}} \varphi_{m,j}^*(u)\varphi_{m,j}(x) = e^{-ixu}\,\mathbf{1}_{|u| \le \pi m}$. This is another
way to see that (20) and (17) actually define the same estimator.

Remark 3.1 An interesting remark follows from equation (20). In the case where
no noise has to be taken into account, i.e. $f_\eta^*(u) = 1$, the integral in (20) becomes
$\int \varphi_{m,j}^*(-u)e^{-iuZ_k}\,du = 2\pi\varphi_{m,j}(Z_k)$. Hence, $\hat a_{m,j} = (1/n)\sum_{k=1}^n \varphi_{m,j}(Z_{(k)})\,w(k/n)$. We
recognize the coefficients of the estimators given by formula (8) of the setting in
Subsection 2.2 when the orthonormal basis $(\varphi_\lambda)_\lambda$ is the sinc basis.



3.3 Discussion on the type of noise 

To determine the rate of convergence of the MISE, it is necessary to specify the type
of the noise distribution. Here two cases are considered. First, the noise distribution
can be exponential with density given by $f_\eta(x) = \theta e^{-\theta x}\mathbf{1}_{x > 0}$, for some $\theta > 0$. Then
we have $f_\eta^*(u) = \theta/(\theta + iu)$, $|f_\eta^*(u)|^2 = 1/(1 + u^2/\theta^2)$ and $\Delta_\eta(m) = m + \pi^2 m^3/(3\theta^2)$.
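The closed form of $\Delta_\eta(m)$ for exponential noise can be checked numerically against the integral in (18). The sketch below is an illustration only; the trapezoid rule, the grid size and the function names are assumptions of this example.

```python
import numpy as np

def delta_eta(m, char_fn, grid_size=20001):
    """(1/(2*pi)) * integral over [-pi*m, pi*m] of du / |f_eta^*(u)|^2."""
    u = np.linspace(-np.pi * m, np.pi * m, grid_size)
    vals = 1.0 / np.abs(char_fn(u)) ** 2
    du = u[1] - u[0]
    # Trapezoid rule written out explicitly.
    integral = du * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return integral / (2 * np.pi)

def exp_char_fn(theta):
    """Fourier transform theta / (theta + i*u) of the exponential density."""
    return lambda u: theta / (theta + 1j * u)
```

For example, `delta_eta(3, exp_char_fn(2.0))` should agree with `3 + np.pi**2 * 27 / 12` up to the discretization error. The same function applies to any other noise distribution whose Fourier transform is available as a vectorized callable.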

In the fluorescence setting, we found that TCSPC noise distributions can be
approximated by densities of the following form

Figure 1: Normalized histogram based on a sample of the noise distribution (solid
line) and the fitted density (dashed line) having the form of (21) with $\alpha = 0.961$,
$\beta = 0.941$, $\nu = 5.74$, $\tau = 5.89$.



with constraints $\alpha > \beta$, $\nu < \tau$, $\beta\tau/(\alpha\nu) > 1$. Figure 1 presents a dataset with 259,260
measurements from the noise distribution of a TCSPC instrument (independently
from the fluorescence measurements) and the corresponding estimated density having
form (21) obtained by least squares fitting. Even though the fit is not perfect, the
estimated density captures the main features of the dataset. Thus densities of the
form (21) can be considered as a good approximate model of the noise distribution
in the fluorescence setting. In the general case of (21) we have

$$f_\eta^*(u) = \frac{\alpha}{\alpha - \beta}\,\frac{\nu}{\nu + iu} - \frac{\beta}{\alpha - \beta}\,\frac{\tau}{\tau + iu}.$$

In the simulation study we will consider a noise distribution of the form (21) with
parameters $\alpha = 2$, $\beta = 1$, $\nu = 1$, $\tau = 2$. In this case we get

$$|f_\eta^*(u)|^2 = \frac{4}{(1 + u^2)(4 + u^2)} \quad\text{and}\quad \Delta_\eta(m) = m + \frac{5}{12}\pi^2 m^3 + \frac{1}{20}\pi^4 m^5. \tag{22}$$

From the application viewpoint it is hence interesting to consider the class of noise
distributions $\eta$ whose characteristic functions decrease in the ordinary smooth way of
order $\gamma$, denoted by $\eta \sim OS(\gamma)$, defined by $c_0(1 + u^2)^{-\gamma} \le |f_\eta^*(u)|^2 \le C_0(1 + u^2)^{-\gamma}$.
Clearly, we find that $\Delta_\eta(m) = O(m^{2\gamma+1})$.

3.4 Rates of convergence on Sobolev spaces 

In classical deconvolution the regularity spaces used for the functions to estimate are
Sobolev spaces defined by

$$\mathcal{C}(a, L) = \left\{g \in (L^1 \cap L^2)(\mathbb{R}),\ \int (1 + u^2)^a |g^*(u)|^2\,du \le L\right\}.$$

If $f_Y$ belongs to $\mathcal{C}(a, L)$, then

$$\|f_Y - f_{Y,m}\|^2 = \frac{1}{2\pi}\int_{|u| \ge \pi m} |f_Y^*(u)|^2\,du = \frac{1}{2\pi}\int_{|u| \ge \pi m} \frac{(1 + u^2)^a |f_Y^*(u)|^2}{(1 + u^2)^a}\,du
\le \frac{1}{2\pi}(1 + (\pi m)^2)^{-a} L \le L(\pi m)^{-2a}.$$

Therefore, if $f_Y \in \mathcal{C}(a, L)$ and $\eta \sim OS(\gamma)$, Proposition 3.1 implies that
$\mathbb{E}(\|\hat f_m - f_Y\|^2) \le C_1 m^{-2a} + C_2 n^{-1} m^{2\gamma+1}$. The optimization of this upper bound provides the
optimal choice of $m$ by $m_{opt} = O(n^{1/(2a+2\gamma+1)})$ with resulting rate
$\mathbb{E}(\|\hat f_{m_{opt}} - f_Y\|^2) \le O(n^{-2a/(2a+2\gamma+1)})$. More formally, one can show the following result.

Proposition 3.2 Assume that the assumptions of Proposition 3.1 are satisfied and
that $f_Y \in \mathcal{C}(a, L)$ and $\eta \sim OS(\gamma)$, then for $m_{opt} = O(n^{1/(2a+2\gamma+1)})$, we have

$$\mathbb{E}(\|\hat f_{m_{opt}} - f_Y\|^2) \le O(n^{-2a/(2a+2\gamma+1)}).$$

Obviously, in practice the optimal choice $m_{opt}$ is not feasible since $a$ and some of
the constants involved in the order are unknown. Therefore, another model selection
device is required to choose a relevant $\hat f_m$ in the collection.

3.5 Model selection 

The general method consists in finding a data driven penalty $\mathrm{pen}(\cdot)$ such that the
following model

$$\hat m = \arg\min_{m \in \mathcal{M}_n}\left(\gamma_n^\dagger(\hat f_m) + \mathrm{pen}(m)\right) \tag{23}$$

achieves a bias-variance trade-off, where $\mathcal{M}_n$ has to be specified. In contrast to
this general approach our result involves an additional $\ln(n)$-factor in the penalty
compared to the variance order, which implies a loss with respect to the expected
rate derived in Section 3.4.

Theorem 3.1 Assume that $f_Y$ is square integrable on $\mathbb{R}$, $\eta \sim OS(\gamma)$ and $w$ satisfies
(6) and (9). Consider the estimator $\hat f_{\hat m}$ with model $\hat m$ defined by (23) with penalty

$$\mathrm{pen}(m) = \kappa'\left(\int_0^1 w^2(u)\,du + \kappa'' c_w^2 \ln(n)\right)\frac{\Delta_\eta(m)}{n}, \tag{24}$$

where $\kappa'$ and $\kappa''$ are numerical constants. Assume moreover that $\eta$ is ordinary smooth,
i.e. $\eta \sim OS(\gamma)$, and that the model collection is described by $\mathcal{M}_n = \{m \in \mathbb{N}, \Delta_\eta(m) \le
n\} = \{1, \dots, m_n\}$. Then, there exist constants $\kappa'$, $\kappa''$ such that

$$\mathbb{E}(\|\hat f_{\hat m} - f_Y\|^2) \le C\inf_{m \in \mathcal{M}_n}\left(\|f_Y - f_{Y,m}\|^2 + \mathrm{pen}(m)\right) + C'\frac{\ln(n)}{n}, \tag{25}$$

where $C$ is a numerical constant and $C'$ depends on $c_w$ and the bounds on $w$.
As previously, the numerical constants $\kappa'$ and $\kappa''$ are calibrated via simulations. In
practice, to compute $\hat m$ by (23), we approximate $\gamma_n^\dagger(\hat f_m)$ by $-\sum_{|j| \le K_n} |\hat a_{m,j}|^2$, where
the sum is truncated to $K_n$ of order $n$.

In the fluorescence set-up, the noise distribution $f_\eta$ is generally unknown. However,
independent, large samples of the noise distribution are available. Hence one
may still use the procedure proposed above by replacing $f_\eta^*$ with the estimate
$\hat f_\eta^*(u) = \sum_{k=1}^M e^{-iu\eta_{-k}}/M$, where $(\eta_{-k})_{1 \le k \le M}$ denotes the independent noise sample.
In Comte and Lacour (2009) the same substitution is considered for deconvolution methods.
It is shown that for ordinary smooth noise this leads to a risk bound exactly analogous to the
one given in (25). The main constraint given in Comte and Lacour (2009) is that
$M \ge n^{1+\epsilon}$, for some $\epsilon > 0$. As the noise samples provided in fluorescence have huge
size, this condition is certainly fulfilled in our practical examples. In the following
numerical study we consider the estimator with both the exact $f_\eta^*$ and an estimated
$\hat f_\eta^*$.
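The substitution of $f_\eta^*$ by its empirical counterpart can be sketched as follows. This is an illustration under assumptions (NumPy, a hypothetical function name, and the whole noise sample held in memory), not the authors' implementation.

```python
import numpy as np

def empirical_char_fn(noise_sample):
    """Return u -> (1/M) * sum_k exp(-i*u*eta_k) for the given noise sample."""
    sample = np.asarray(noise_sample, dtype=float)

    def f_eta_star_hat(u):
        u = np.atleast_1d(u)
        # One row per frequency, one column per noise observation.
        return np.exp(-1j * np.outer(u, sample)).mean(axis=1)

    return f_eta_star_hat
```

The returned callable can be passed wherever a vectorized $f_\eta^*$ is expected. For very large noise samples and many frequencies, the outer product may need to be evaluated in chunks to limit memory use.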



4 Numerical results for simulated and real data 

In this section we first give details on the practical implementation of the estima- 
tion methods. Then a simulation study is conducted to test the performance of the 
methods in different settings. Finally, an application to a sample of fluorescence data 
shows that the estimation method gives satisfying results on real measurements. 



4.1 Practical computation of estimators 

In the case of no additional noise, we apply the method described in Section 2 with the
trigonometric basis [T]. To determine the best model $\hat m$ we compute $\gamma_n(m) + \mathrm{pen}(m)$
for all $m = 1, \dots, [n/2] - 1$. This is computationally easy as the following recursive
relation can be used. We have $\gamma_n(0) + \mathrm{pen}(0) = -\hat a_0^2 + \kappa W/n$, $\gamma_n(1) + \mathrm{pen}(1) =
-\hat a_0^2 - \hat a_1^2 - \hat a_2^2 + 3\kappa W/n$ and $\gamma_n(m + 1) + \mathrm{pen}(m + 1) = \gamma_n(m) + \mathrm{pen}(m) - \hat a_{2m+1}^2 -
\hat a_{2m+2}^2 + 2\kappa W/n$, for all $m \ge 1$, where $W = \int_0^1 w^2(u)\,du$. The coefficients $\hat a_j$ are given by
(8). Then $\hat m$ is the value where $\gamma_n(m) + \mathrm{pen}(m)$ achieves its minimum. Finally, the
estimator of $f_Y$ is given by $\hat f_{\hat m} = \sum_{\lambda \in \Lambda_{\hat m}} \hat a_\lambda \varphi_\lambda$.
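The recursion above amounts to a cumulative sum of squared coefficients plus a linear penalty, which the following sketch makes explicit. The function name and argument names are hypothetical, and the constant model $m = 0$ is included in the search, as in the first step of the recursion.

```python
import numpy as np

def select_dimension(coeffs, kappa, w_sq_integral, n):
    """Penalized selection for the trigonometric collection.

    coeffs: a_0, a_1, ..., a_{2*m_max}, ordered as constant, cos, sin, cos, sin, ...
    w_sq_integral: the value W = integral of w^2 over [0, 1].
    Returns (m_hat, criterion values for m = 0..m_max).
    """
    m_max = (coeffs.size - 1) // 2
    sq = np.asarray(coeffs, dtype=float) ** 2
    # gamma_n(m) = -(sum of the first 2m + 1 squared coefficients)
    gamma = -np.cumsum(sq)[0::2][: m_max + 1]
    # pen(m) = kappa * W * D_m / n with D_m = 2m + 1
    pen = kappa * w_sq_integral * (2 * np.arange(m_max + 1) + 1) / n
    crit = gamma + pen
    return int(np.argmin(crit)), crit
```

Consecutive criterion values differ by $-\hat a_{2m+1}^2 - \hat a_{2m+2}^2 + 2\kappa W/n$, so a pair of coefficients is kept only if its squared norm exceeds the penalty increment.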

In the case of additional noise, we use the estimator proposed in Section 3 based
on the sinc basis. Its computation is more intensive as no similar recursive relation
holds. First one has to compute the coefficients $\hat a_{m,j}$ defined in (20). For $j \ge 0$ they
can be approximated as follows

$$\hat a_{m,j} = \frac{1}{2\pi}\int \varphi_{m,j}^*(-u)\,\frac{\widehat{f_X^*}(u)}{f_\eta^*(u)}\,du = (-1)^j\sqrt{m}\int_0^1 e^{2i\pi jv}\,\frac{\widehat{f_X^*}}{f_\eta^*}(\pi m(2v - 1))\,dv$$
$$\approx (-1)^j\sqrt{m}\,\frac{1}{T}\sum_{t=0}^{T-1} e^{2i\pi jt/T}\,\frac{\widehat{f_X^*}}{f_\eta^*}(\pi m(2t/T - 1)) = (-1)^j\sqrt{m}\,(\mathrm{IFFT}(H))_j,$$

where $\mathrm{IFFT}(H)$ is the inverse fast Fourier transform of the $T$-vector $H$ whose $t$-th
entry equals $\widehat{f_X^*}(\pi m(2t/T - 1))/f_\eta^*(\pi m(2t/T - 1))$. Similarly, for $j < 0$ the coefficients
are approximated by $\hat a_{m,j} = (-1)^j\sqrt{m}\,(\mathrm{IFFT}(H))_j$.

The integral $\Delta_\eta(m)$ appearing in the penalty term $\mathrm{pen}(m)$ defined in (24) is explicitly
known if $f_\eta$ is known (see Section 3.3). In the case when we only have an estimator
$\hat f_\eta^*$, $\Delta_\eta(m)$ can be approximated by a Riemann sum of the form
$(m/S)\sum_{s=0}^{S-1} |\hat f_\eta^*(\pi m(2s/S - 1))|^{-2}$.
Then the best model $\hat m$ is selected as the point of minimum of the criterion given in
(23). Finally, we obtain the estimator $\hat f_{\hat m} = \sum_{|j| \le T} \hat a_{\hat m,j}\varphi_{\hat m,j}$ with the sinc functions
$\varphi_{m,j}$ defined in (19).

Figure 2: True density and 25 estimated curves without measurement errors.
Estimation with the trigonometric basis for different levels of the pile-up effect. Numbers
below the figures are the MISE. (Plots not reproduced; MISE values per panel:)

                         Gamma     Exponential   Pareto    Weibull
    mu = 0.01, n = 1000  0.0434    2.325         9.6035    81.010
    mu = 0.5,  n = 1000  0.0519    2.3306        9.6801    80.1796
    mu = 2,    n = 1000  0.0610    2.4329        9.7500    82.6353
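The IFFT approximation of the coefficients $\hat a_{m,j}$ described above can be sketched with NumPy as follows. The function name, the default value of $T$ and the callable `psi` (standing for the ratio $\widehat{f_X^*}/f_\eta^*$) are assumptions of this example. Negative indices $j$ are read through the periodic indexing of the discrete transform, i.e. index $T + j$, which is one possible convention and is stated here as an assumption.

```python
import numpy as np

def sinc_coefficients(psi, m, T=1024):
    """Approximate a_{m,j}, j = 0..T-1, by (-1)^j * sqrt(m) * IFFT(H)_j.

    psi: vectorized callable u -> fx_hat_star(u) / f_eta_star(u).
    H_t = psi(pi * m * (2t/T - 1)) for t = 0..T-1.
    """
    t = np.arange(T)
    H = psi(np.pi * m * (2 * t / T - 1))
    signs = np.where(t % 2 == 0, 1.0, -1.0)  # (-1)^j
    # np.fft.ifft computes (1/T) * sum_t H_t * exp(2i*pi*j*t/T).
    return signs * np.sqrt(m) * np.fft.ifft(H)
```

Only indices $j$ that are small compared with $T$ are reliable, since the Riemann sum degrades as the integrand oscillates faster. In the noise-free, single-observation case `psi = lambda u: np.exp(-1j * u * z0)`, the real parts should be close to $\sqrt{m}\,\mathrm{sinc}(m z_0 - j)$.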



Figures 2 and 3 present the visual summary of our simulation results. We
implemented the estimation methods when $f_Y$ has one of the following pdfs.

1. a Gamma(3, 3) pdf, $x^2\exp(-x/3)/(2!\,3^3)\,\mathbf{1}_{x > 0}$, to have a benchmark with a
smooth distribution,

2. an exponential pdf, $(1/3)\exp(-x/3)\,\mathbf{1}_{x > 0}$,

3. a Pareto(1/4, 1, 0) pdf, $(1 + x/4)^{-5}\,\mathbf{1}_{x > 0}$,

4. a Weibull(1/4, 3/4) pdf, $(3/4)(1/4)^{-3/4}x^{-1/4}\exp(-(4x)^{3/4})\,\mathbf{1}_{x > 0}$.

The last two densities are inspired by chemical results about fluorescence phenomena
given in Berberan-Santos et al. (2005a,b).

Figure 3: True density and 25 estimated curves. Estimation by deconvolution with
sinc basis for different noise levels and different levels of the pile-up effect. Numbers
below the figures indicate mean and standard deviation of the selected model $\hat m$.
(Plots not reproduced; values per panel:)

                                      Gamma         Exponential   Pareto        Weibull
    sigma = 0.7, mu = 0.01, n = 2000  3.36 (0.49)   14.2 (2.1)    22.4 (1.8)    43.1 (5.9)
    sigma = 0.5, mu = 1,    n = 2000  3.36 (0.57)   15.2 (1.6)    27.1 (3.6)    48.9 (6.3)
    sigma = 0.1, mu = 2,    n = 2000  3.00 (0.00)   15.3 (2.0)    33.6 (3.5)    81.0 (7.6)



4.2 Simulation study 



When no noise is added, we apply the method described in Section 2 with the
simple trigonometric basis. The numerical constant $\kappa$ of the penalty (14) is set to 0.5,
a value obtained from a previous calibration by simulation. The Poisson parameter takes
the values 0.01, 0.5 and 2. The mean MISE over 25 paths is computed on the intervals
of representation. From Figure 2 one can see that the results are rather good, in
Exponential noise

(σ², μ)                   (0.2,0.5)     (0.2,1.5)     (0.2,2)       (1,0.5)       (1,1.5)       (1,2)
Gamma        exact        .063 (.042)   .081 (.045)   .112 (.026)   .061 (.039)   .088 (.040)   .115 (.028)
             estimated    .063 (.042)   .081 (.045)   .112 (.026)   .061 (.039)   .087 (.040)   .115 (.028)
Exponential  exact        1.11 (0.22)   1.20 (0.26)   1.45 (0.21)   1.36 (0.26)   1.40 (0.24)   1.67 (0.27)
             estimated    1.11 (0.22)   1.19 (0.25)   1.46 (0.21)   1.36 (0.27)   1.40 (0.24)   1.67 (0.27)
Pareto       exact        4.25 (0.82)   4.55 (0.58)   5.45 (0.84)   6.62 (1.5)    6.58 (0.95)   8.09 (1.2)
             estimated    4.23 (0.83)   4.56 (0.61)   5.47 (0.83)   6.62 (1.6)    6.58 (1.0)    8.09 (1.2)
Weibull      exact        10.6 (6.7)    9.46 (5.0)    9.22 (2.7)    21.4 (4.1)    26.7 (5.6)    39.5 (5.9)
             estimated    8.54 (4.7)    9.40 (4.8)    9.30 (2.3)    22.1 (4.8)    26.7 (5.7)    40.1 (5.7)

Bi-exponential noise

(σ², μ)                   (0.2,0.5)     (0.2,1.5)     (0.2,2)       (1,0.5)       (1,1.5)       (1,2)
Gamma        exact        .060 (.032)   .075 (.040)   .113 (.023)   .061 (.048)   .088 (.043)   .114 (.025)
             estimated    .060 (.032)   .075 (.040)   .113 (.023)   .062 (.048)   .089 (.043)   .114 (.025)
Exponential  exact        1.06 (0.20)   1.14 (0.17)   1.49 (0.26)   1.23 (0.27)   1.37 (0.28)   1.62 (0.28)
             estimated    1.06 (0.20)   1.14 (0.16)   1.48 (0.25)   1.25 (0.26)   1.37 (0.28)   1.62 (0.27)
Pareto       exact        4.15 (0.76)   4.31 (0.69)   5.08 (0.71)   6.08 (1.5)    6.41 (1.1)    7.43 (1.0)
             estimated    4.14 (0.77)   4.30 (0.69)   5.07 (0.72)   6.11 (1.6)    6.49 (1.2)    7.45 (1.1)
Weibull      exact        10.2 (6.1)    8.89 (5.6)    8.29 (2.1)    24.7 (3.9)    29.4 (4.5)    40.1 (5.2)
             estimated    8.25 (4.3)    8.75 (5.4)    8.31 (2.2)    24.9 (4.3)    29.5 (4.9)    40.4 (5.3)

Table 1: 100 × mean MISE, with standard deviation in parentheses. For each density, the
first line ("exact") corresponds to the exact noise distribution and the second line
("estimated") gives the results obtained with the estimated noise distribution.

spite of small boundary effects, which would be avoided with piecewise polynomial bases.
For this reason all representations in Figure 2 are cut on the right. We see
that the estimator performs well for a large range of values of the Poisson parameter.
The first row corresponds to data where the pile-up effect is negligible, as the Poisson
parameter is equal to 0.01, and hence serves as a benchmark. Here estimation errors
are mainly due to the choice of a trigonometric basis, which easily recovers the Gamma
density, while the Weibull density is much harder to approximate in this basis. In the
other rows the pile-up effect is considerably stronger; however, the accuracy is hardly
affected and the estimator remains rather stable. The pile-up effect is hence correctly
taken into account in the estimation procedure.
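To make the data-generating mechanism behind these experiments concrete, here is a minimal simulation sketch. It assumes, as is usual for the pile-up model, that the number N of variables per observation follows a Poisson distribution with parameter μ conditioned on N ≥ 1, and that the observation is the minimum of N i.i.d. draws from the target density. The function names and the exponential target with mean 3 are illustrative choices.

```python
import math
import random

def sample_truncated_poisson(mu, rng):
    """Draw N from a Poisson(mu) law conditioned on N >= 1 (inversion method)."""
    u = rng.random()
    k = 1
    p = mu * math.exp(-mu) / (1.0 - math.exp(-mu))   # P(N = 1 | N >= 1)
    cdf = p
    while u > cdf and k < 1000:   # the cap guards against floating-point stalls
        k += 1
        p *= mu / k               # P(N = k) = P(N = k - 1) * mu / k
        cdf += p
    return k

def sample_pileup(n, mu, draw_target, rng):
    """Return n pile-up observations Z = min(Y_1, ..., Y_N)."""
    return [min(draw_target(rng) for _ in range(sample_truncated_poisson(mu, rng)))
            for _ in range(n)]

rng = random.Random(0)
draw_exp = lambda r: r.expovariate(1.0 / 3.0)        # exponential target, mean 3
z_weak = sample_pileup(20000, 0.01, draw_exp, rng)   # pile-up almost absent
z_strong = sample_pileup(20000, 2.0, draw_exp, rng)  # pronounced pile-up
mean_weak = sum(z_weak) / len(z_weak)
mean_strong = sum(z_strong) / len(z_strong)
```

The sample mean under μ = 2 is markedly smaller than under μ = 0.01, which reflects the right-censoring induced by taking minima.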

The adaptive estimator described in Section 3 is tested with the numerical constants
$\kappa' = 1$ and $\kappa'' = 0.001$ in (24). The value of $\kappa''$ is very small and makes the
logarithmic term negligible in general, except when $c_w^2$ is large (for instance
$c_w^2 \approx 416$ for $\mu = 2$). The results are given in Figure 3. Now the observations are
$Y = X + \eta$, where $\eta = \sigma\varepsilon$. In the first row, the pile-up effect is almost
negligible ($\mu = 0.01$), but $\sigma$ is rather large. That is, the first row illustrates the
performance of the deconvolution step of the estimation procedure. In contrast, for the
last row $\sigma$ is taken to be small, but the pile-up effect is significant ($\mu = 2$), to see
how the estimator copes with the pile-up effect. The second row is an intermediate
situation, illustrating how the estimator performs when the variance of the noise and
the pile-up effect are both non-negligible.
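The value of about 416 quoted for μ = 2 can be reproduced numerically if the constant is read as $c_w^2$, with $c_w$ the supremum of $|w'|$ on [0, 1]. The sketch below rests on two assumptions that are not restated in this section: N is Poisson with parameter μ conditioned on N ≥ 1, so that the distortion function is $w(u) = (1-e^{-\mu})/(\mu(1-(1-e^{-\mu})u))$, and $c_w = \sup|w'|$. The function names are hypothetical.

```python
import math

def w(u, mu):
    """Assumed distortion function of the Poisson pile-up model on [0, 1]."""
    a = 1.0 - math.exp(-mu)
    return a / (mu * (1.0 - a * u))

def w_prime(u, mu):
    """Derivative of w with respect to u."""
    a = 1.0 - math.exp(-mu)
    return a ** 2 / (mu * (1.0 - a * u) ** 2)

def c_w(mu):
    """Supremum of |w'| on [0, 1]; w' is increasing, so it is attained at u = 1."""
    return w_prime(1.0, mu)

c_w_squared_strong = c_w(2.0) ** 2    # roughly 416.6, in line with the quoted value
c_w_squared_weak = c_w(0.01) ** 2     # tiny when the pile-up effect is negligible
```

Under these assumptions $c_w = (e^{\mu}-1)^2/\mu$, which grows quickly with μ and explains why the logarithmic term of the penalty only matters for strong pile-up.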

The 25 curves indicate variability bands for the estimation procedure. They show
that the estimator is quite stable, especially in the last rows. Moreover, the selected
model order $\hat m$ differs from one example to the other. Overall, the selected dimension
increases when going from example 1 to 4. This means that the estimator adapts to
peaks that are more and more difficult to recover.

In Table 1 the MISE of the estimation procedure is analyzed. The table gives
the empirical mean and standard deviation of the MISE obtained over 100 simulated
datasets. This is done for the same four examples of distributions as above. We
compare the error of the estimator using the exact noise distribution to that of the
estimator using an approximation of the noise distribution computed from an independent
noise sample of size 500. Moreover, we study the influence of the noise distribution on
the estimator. To this end, we consider, on the one hand, exponential noise with variances
$\sigma^2 \in \{0.2, 1\}$, and on the other hand the density (21) with $\alpha = 2$, $\beta = 1$,
$\nu = 1$, $\tau = 2$ (rescaled by adequate constants to have the same variance $\sigma^2$ as the
exponential distributions).
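One simple way to approximate the noise distribution from an independent sample is the empirical Fourier transform of that sample. The sketch below illustrates this under that assumption; the section does not specify which estimator of the noise distribution was used, and all names are hypothetical. Exponential noise with variance σ² = 0.2 (hence scale σ) and a sample of size 500 serve as the test case.

```python
import cmath
import math
import random

def empirical_char_fn(sample):
    """Return u -> (1/M) * sum_j exp(-i*u*eta_j), an estimate of E[exp(-i*u*eta)]."""
    m = len(sample)
    return lambda u: sum(cmath.exp(-1j * u * x) for x in sample) / m

def exact_exponential_char_fn(sigma):
    """E[exp(-i*u*eta)] = 1/(1 + i*sigma*u) for an exponential law with scale sigma."""
    return lambda u: 1.0 / (1.0 + 1j * sigma * u)

rng = random.Random(1)
sigma = math.sqrt(0.2)                                   # noise variance 0.2
noise = [rng.expovariate(1.0 / sigma) for _ in range(500)]
f_hat = empirical_char_fn(noise)
f_exact = exact_exponential_char_fn(sigma)
max_gap = max(abs(f_hat(u) - f_exact(u)) for u in (0.5, 1.0, 2.0, 4.0))
```

The pointwise error of such an estimate is of order $M^{-1/2}$ for a noise sample of size M, which is small compared with the decay of the exponential transform over the frequencies used here.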

From Table 1 it is clear that increasing the variance of the noise distribution
increases the error. Furthermore, changing the type of the noise has little influence
on the estimation procedure. Indeed, the second case (21) is just slightly less favorable
than the exponential distribution. This difference is in accordance with Proposition
3.2, which holds with $\gamma = 1$ for the exponential and with $\gamma = 2$ for the other density.
The comparison with the results based on an approximated noise distribution (second
lines) reveals that there is rarely a difference between the two methods. Indeed, using
an approximation of the noise does not corrupt the results; in some cases we even
observe an improvement of the error. We show in Figure 4 that it is indispensable to
take into account both the pile-up correction (which is omitted in (b), where $w(i/n)$
is replaced by $i/n$) and the deconvolution correction (which is omitted in (c), where
the estimation is done with the method of Section 2 and the trigonometric basis).
Thus, we conclude from these simulation results for the fluorescence setting that it is
justified to use an estimate of the noise instead of the theoretical distribution.
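The role of the weights $w(i/n)$ can be illustrated on a toy functional. The proofs use the identity $E\,h(Y) = E[h(Z)\,w\circ G(Z)]$; replacing G by the empirical distribution function of the pile-up sample amounts to weighting the i-th order statistic by $w(i/n)$. The sketch below applies this to $h(y) = y$ rather than to basis functions. It assumes that N is Poisson(μ) conditioned on N ≥ 1, so that $w(u) = (1-e^{-\mu})/(\mu(1-(1-e^{-\mu})u))$; the names, the exponential target with mean 3 and the value μ = 1 are hypothetical choices.

```python
import math
import random

def w(u, mu):
    """Assumed distortion function of the Poisson pile-up model."""
    a = 1.0 - math.exp(-mu)
    return a / (mu * (1.0 - a * u))

def sample_truncated_poisson(mu, rng):
    """Draw N from a Poisson(mu) law conditioned on N >= 1 (inversion method)."""
    u = rng.random()
    k = 1
    p = mu * math.exp(-mu) / (1.0 - math.exp(-mu))
    cdf = p
    while u > cdf and k < 1000:
        k += 1
        p *= mu / k
        cdf += p
    return k

def corrected_mean(z, mu):
    """(1/n) * sum_i Z_(i) * w(i/n): pile-up corrected estimate of E[Y]."""
    n = len(z)
    return sum(zi * w((i + 1) / n, mu) for i, zi in enumerate(sorted(z))) / n

rng = random.Random(2)
mu = 1.0
z = [min(rng.expovariate(1.0 / 3.0) for _ in range(sample_truncated_poisson(mu, rng)))
     for _ in range(20000)]
naive = sum(z) / len(z)          # ignores the pile-up effect, biased downwards
corrected = corrected_mean(z, mu)
```

The naive average underestimates the target mean of 3, whereas the weighted average is close to it, which is the same mechanism that panel (b) of Figure 4 switches off.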

4.3 Application to Fluorescence Measurements 

We finally applied the estimation procedure to real fluorescence lifetime measurements
obtained by TCSPC. The data analyzed here are presented in Figure 5(a)
by the histogram of the fluorescence lifetime measurements and the histogram of the
noise distribution based on a sample obtained independently from the fluorescence
Figure 4: (a) Estimation with pile-up correction and deconvolution. (b) No pile-up
correction. (c) No deconvolution.



measurements. The sample size of the fluorescence measurements is n = 1,743,811.
The same sample of the noise distribution has already been considered in Figure 1,
where it is compared to the parametric density given by (21). In this setting the
true density is known to be an exponential distribution with mean 2.54 nanoseconds
and the Poisson parameter equals 0.166. The knowledge of the true density allows us
to evaluate the performance of our estimator. More details on the data and their
acquisition can be found in Patting et al. (2007).



We applied the estimator from Section 3 with the sinc basis to this dataset. The
numerical constants are $\kappa' = 1$ and $\kappa'' = 0.001$. Figure 5(b) shows the estimation
result in comparison to the exponential density with mean 2.54. We observe that the
estimated function is quite close to the 'true' one. This indicates that the estimation
procedure takes the errors present in the real data adequately into account and that
the modeling by the pile-up distortion and additive measurement errors is appropriate.

We conclude that the estimation methods proposed in this paper have a satis- 
factory behavior in various settings and give rather good results on both synthetic 
and real data. Nevertheless, we observed that the performance depends on the choice 
of the basis and on the smoothness of the target density. Here only two bases are 
considered, but others should work as well and may improve the results in certain 
settings. 

5 Proofs 



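5.1 Proof of Proposition 2.1

Pythagoras' formula yields $\|f_Y - \hat f_m\|^2 = \|f_Y - f_m\|^2 + \|f_m - \hat f_m\|^2$. By
definition of the orthogonal projection, $f_m = \sum_{\lambda\in\Lambda_m} a_\lambda\varphi_\lambda$, and by the
relation between $f_Y$ and the distribution of the pile-up observations, we have
$a_\lambda = \langle\varphi_\lambda, f_Y\rangle = E(\varphi_\lambda(Y)) = E(\varphi_\lambda(Z_1)\,w\circ G(Z_1))$. This,
together with the definition of $\hat f_m$,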
Figure 5: (a) Fluorescence lifetime measurements (solid line) and independent sample
of the noise distribution (dashed). (b) Density estimator (solid) and 'true' exponential
density with mean 2.54 (dashed).



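implies that $\|f_m - \hat f_m\|^2 = \sum_{\lambda\in\Lambda_m}(a_\lambda - \hat a_\lambda)^2$. If we define

$$\nu_n(h) = \frac{1}{n}\sum_{i=1}^n \big[h(Z_i)\,w\circ G(Z_i) - E(h(Z_i)\,w\circ G(Z_i))\big], \tag{26}$$

$$R_n(h) = \frac{1}{n}\sum_{i=1}^n h(Z_i)\big[w\circ\hat G_n(Z_i) - w\circ G(Z_i)\big], \tag{27}$$

then we get $\|f_m - \hat f_m\|^2 \le 2\sum_{\lambda\in\Lambda_m}\big(\nu_n(\varphi_\lambda)^2 + R_n(\varphi_\lambda)^2\big)$. We have, on the one hand,

$$\begin{aligned}
\sum_{\lambda\in\Lambda_m} E\big(\nu_n^2(\varphi_\lambda)\big)
&= \sum_{\lambda\in\Lambda_m}\frac{1}{n}\,\mathrm{Var}\big(\varphi_\lambda(Z_1)\,w\circ G(Z_1)\big)
\le \sum_{\lambda\in\Lambda_m}\frac{1}{n}\,E\big[\varphi_\lambda^2(Z_1)\,(w\circ G(Z_1))^2\big] \\
&\le \frac{1}{n}\,E\Big[\sum_{\lambda\in\Lambda_m}\varphi_\lambda^2(Z_1)\,(w\circ G(Z_1))^2\Big]
\le \Phi_0^2\,\frac{D_m}{n}\,E\big[(w\circ G(Z_1))^2\big],
\end{aligned} \tag{28}$$

because the basis satisfies $\|\sum_{\lambda\in\Lambda_m}\varphi_\lambda^2\|_\infty \le \Phi_0^2 D_m$. On the other hand, we have

$$\begin{aligned}
\sum_{\lambda\in\Lambda_m} E\big(R_n^2(\varphi_\lambda)\big)
&\le \sum_{\lambda\in\Lambda_m} E\Big[\Big(\frac{1}{n}\sum_{i=1}^n \varphi_\lambda(Z_i)\big[w\circ\hat G_n(Z_i) - w\circ G(Z_i)\big]\Big)^2\Big] \\
&\le \frac{1}{n}\sum_{i=1}^n\sum_{\lambda\in\Lambda_m} E\Big[\varphi_\lambda^2(Z_i)\big(w\circ\hat G_n(Z_i) - w\circ G(Z_i)\big)^2\Big] \\
&\le c_w^2\sum_{\lambda\in\Lambda_m} E\big(\|G - \hat G_n\|_\infty^2\,\varphi_\lambda^2(Z_1)\big)
\le c_w^2\,\Phi_0^2\,D_m\,E\big(\|G - \hat G_n\|_\infty^2\big)
\le c_w^2\,\Phi_0^2\,\frac{D_m}{n},
\end{aligned} \tag{29}$$

by the Lipschitz property of $w$ with constant $c_w$ and because of
$E[\|G - \hat G_n\|_\infty^2] \le 1/n$ (see e.g. Brunel and Comte, 2005, p. 462). By gathering
all terms, we obtain the risk bound stated in Proposition 2.1. □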
5.2 Sketch of proof of Theorem 2.1

We can write $\gamma_n(t) - \gamma_n(s) = \|t - f_Y\|^2 - \|s - f_Y\|^2 - 2\nu_n(t-s) - 2R_n(t-s)$,
where $\nu_n$ and $R_n$ are defined by (26) and (27). By definition of $\hat f_{\hat m}$ we have, for all
$m \in \mathcal{M}_n$, $\gamma_n(\hat f_{\hat m}) + \mathrm{pen}(\hat m) \le \gamma_n(f_m) + \mathrm{pen}(m)$. This can be rewritten as
$\|\hat f_{\hat m} - f_Y\|^2 \le \|f_m - f_Y\|^2 + \mathrm{pen}(m) + 2\nu_n(\hat f_{\hat m} - f_m) - \mathrm{pen}(\hat m) + 2R_n(\hat f_{\hat m} - f_m)$.
Using this and the fact that $2xy \le x^2/\theta + \theta y^2$ for all nonnegative $x, y$ and all
$\theta > 0$, we obtain

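$$\begin{aligned}
\|f_Y - \hat f_{\hat m}\|^2
&\le \|f_Y - f_m\|^2 + \mathrm{pen}(m) + 2\nu_n(\hat f_{\hat m} - f_m) - \mathrm{pen}(\hat m) + 2R_n(\hat f_{\hat m} - f_m) \\
&\le \|f_Y - f_m\|^2 + \mathrm{pen}(m) + 2\|\hat f_{\hat m} - f_m\| \sup_{t\in S_{\hat m}+S_m,\,\|t\|=1} |\nu_n(t)| - \mathrm{pen}(\hat m) \\
&\quad + 2\|\hat f_{\hat m} - f_m\| \sup_{t\in S_{\hat m}+S_m,\,\|t\|=1} |R_n(t)| \\
&\le \|f_Y - f_m\|^2 + \mathrm{pen}(m) + \frac{1}{4}\|\hat f_{\hat m} - f_m\|^2 + 4\sup_{t\in S_{\hat m}+S_m,\,\|t\|=1} [\nu_n(t)]^2 \\
&\quad - \mathrm{pen}(\hat m) + \frac{1}{8}\|\hat f_{\hat m} - f_m\|^2 + 8\sup_{t\in S_{\hat m}+S_m,\,\|t\|=1} [R_n(t)]^2 .
\end{aligned}$$

As $\|\hat f_{\hat m} - f_m\|^2 \le 2(\|\hat f_{\hat m} - f_Y\|^2 + \|f_m - f_Y\|^2)$, this yields

$$\begin{aligned}
\frac{1}{4}E\big[\|f_Y - \hat f_{\hat m}\|^2\big]
&\le \frac{7}{4}\|f_Y - f_m\|^2 + 2\,\mathrm{pen}(m) + 8E\Big(\sup_{t\in S_{m_n},\,\|t\|=1} [R_n(t)]^2\Big) \\
&\quad + 4E\Big(\sup_{t\in S_{\hat m}+S_m,\,\|t\|=1} [\nu_n(t)]^2 - (\mathrm{pen}(m) + \mathrm{pen}(\hat m))/4\Big).
\end{aligned}$$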
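For the reader's convenience, the constants 1/4 and 7/4 can be traced as follows. This is only a bookkeeping step, assuming that the elementary inequality $2xy \le x^2/\theta + \theta y^2$ is applied with $\theta = 4$ to the $\nu_n$ term and with $\theta = 8$ to the $R_n$ term; the suprema are taken over the same sets as in the main derivation.

```latex
\begin{align*}
\Big(\tfrac14 + \tfrac18\Big)\|\hat f_{\hat m} - f_m\|^2
  &\le \tfrac38 \cdot 2\big(\|\hat f_{\hat m} - f_Y\|^2 + \|f_m - f_Y\|^2\big)
   = \tfrac34\|\hat f_{\hat m} - f_Y\|^2 + \tfrac34\|f_m - f_Y\|^2, \\
\Big(1 - \tfrac34\Big)\|f_Y - \hat f_{\hat m}\|^2
  &\le \Big(1 + \tfrac34\Big)\|f_Y - f_m\|^2 + \mathrm{pen}(m) - \mathrm{pen}(\hat m)
     + 4\sup_t [\nu_n(t)]^2 + 8\sup_t [R_n(t)]^2, \\
\mathrm{pen}(m) - \mathrm{pen}(\hat m) + 4\sup_t [\nu_n(t)]^2
  &= 2\,\mathrm{pen}(m)
     + 4\Big(\sup_t [\nu_n(t)]^2 - \tfrac14\big(\mathrm{pen}(m) + \mathrm{pen}(\hat m)\big)\Big).
\end{align*}
```

Multiplying the resulting bound by 4 gives the factors 7 and 8 that appear in the final risk bound of this proof.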

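Then the term $E\big(\sup_{t\in S_{\hat m}+S_m,\,\|t\|=1}[\nu_n(t)]^2 - (\mathrm{pen}(m) + \mathrm{pen}(\hat m))/4\big)$ is bounded
by $C/n$ by using Talagrand's inequality in a standard way (see e.g. Brunel et al., 2005).
For the last term $E\big(\sup_{t\in S_{m_n},\,\|t\|=1}[R_n(t)]^2\big)$, we define the event

$$\Omega_G = \big\{\sqrt{n}\,\|\hat G_n - G\|_\infty \le \sqrt{\ln n}\big\}. \tag{30}$$

Now, we know from Massart (1990) that

$$P\big(\sqrt{n}\,\|\hat G_n - G\|_\infty > \lambda\big) \le 2e^{-2\lambda^2}. \tag{31}$$

This implies that $P(\Omega_G^c) \le 2/n^2$. Then we write that $E\big(\sup_{t\in S_{m_n},\,\|t\|=1}[R_n(t)]^2\big)$ is less
than

$$E\Big[\sup_{t\in S_{m_n},\,\|t\|=1} [R_n(t)]^2\,\mathbf{1}_{\Omega_G}\Big]
+ E\Big[\sup_{t\in S_{m_n},\,\|t\|=1} [R_n(t)]^2\,\mathbf{1}_{\Omega_G^c}\Big]
=: \mathcal{R}_1 + \mathcal{R}_2 .$$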
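The bound on $P(\Omega_G^c)$ used in this step follows by evaluating the Massart inequality (31) at $\lambda = \sqrt{\ln n}$, reading the threshold in the definition (30) of $\Omega_G$ as $\sqrt{\ln n}$:

```latex
\[
P(\Omega_G^c)
  = P\big(\sqrt{n}\,\|\hat G_n - G\|_\infty > \sqrt{\ln n}\big)
  \le 2\exp(-2\ln n)
  = \frac{2}{n^2}.
\]
```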
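For the first term, we have

$$\begin{aligned}
\mathcal{R}_1
&\le c_w^2\,\frac{\ln(n)}{n}\,E\Big[\sup_{t\in S_{m_n},\,\|t\|=1}\Big(\frac{1}{n}\sum_{i=1}^n |t(Z_i)|\Big)^2\Big]
\le c_w^2\,\frac{\ln(n)}{n}\,E\Big[\sup_{t\in S_{m_n},\,\|t\|=1}\frac{1}{n}\sum_{i=1}^n t^2(Z_i)\Big] \\
&\le 2c_w^2\,\frac{\ln(n)}{n}\Big(E\Big[\sup_{t\in S_{m_n},\,\|t\|=1} |\nu_n'(t^2)|\Big] + \sup_{t\in S_{m_n},\,\|t\|=1} E\big(t^2(Z_1)\big)\Big),
\end{aligned}$$

where $\nu_n'(\varphi) = \frac{1}{n}\sum_{i=1}^n \big[\varphi(Z_i) - E(\varphi(Z_i))\big]$. It is proved in Brunel and Comte (2005)
that $E\big(\sup_{t\in S_{m_n},\,\|t\|=1} |\nu_n'(t^2)|\big) \le C\ln(n)$ if the density of $Z_1$ is bounded and
$N_n \le O(n)$ for bases [DP] and [W] and $N_n \le O(\sqrt{n})$ for basis [T]. Moreover,
$E(t^2(Z_1)) \le \|t\|^2\,\|f_Y\|_\infty/w_0$. We obtain $\mathcal{R}_1 \le C\ln^2(n)/n$. On the other hand, we have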

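$$\mathcal{R}_2 \le c_w^2\,\Phi_0^2\,N_n\,P(\Omega_G^c) \le \frac{C}{n}.$$

This yields $E\big(\sup_{t\in S_{m_n},\,\|t\|=1}[R_n(t)]^2\big) \le C\ln^2(n)/n$. Finally we obtain that, for all
$m \in \mathcal{M}_n$, $E[\|f_Y - \hat f_{\hat m}\|^2] \le 7\|f_Y - f_m\|^2 + 8\,\mathrm{pen}(m) + K\ln^2(n)/n$, which ends the proof. □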



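5.3 Proof of Proposition 3.1

We have $\|\hat f_m - f_Y\|^2 = (2\pi)^{-1}\|\hat f_m^* - f_Y^*\|^2 = (2\pi)^{-1}\big(\|f_{Y,m}^* - f_Y^*\|^2 + \|\hat f_m^* - f_{Y,m}^*\|^2\big)$. Moreover,

$$\begin{aligned}
\frac{1}{2\pi}\|\hat f_m^* - f_{Y,m}^*\|^2
&= \frac{1}{2\pi}\int_{-\pi m}^{\pi m}\frac{du}{|f_\eta^*(u)|^2}\,\frac{1}{n^2}
   \Big|\sum_{k=1}^n\big[e^{-iuZ_k}\,w\circ\hat G_n(Z_k) - E\big(e^{-iuZ_k}\,w\circ G(Z_k)\big)\big]\Big|^2 \\
&\le 2\cdot\frac{1}{2\pi}\int_{-\pi m}^{\pi m}\frac{du}{|f_\eta^*(u)|^2}\,\frac{1}{n^2}
   \Big|\sum_{k=1}^n e^{-iuZ_k}\big[w\circ\hat G_n(Z_k) - w\circ G(Z_k)\big]\Big|^2 \\
&\quad + 2\cdot\frac{1}{2\pi}\int_{-\pi m}^{\pi m}\frac{du}{|f_\eta^*(u)|^2}\,\frac{1}{n^2}
   \Big|\sum_{k=1}^n\big[e^{-iuZ_k}\,w\circ G(Z_k) - E\big(e^{-iuZ_k}\,w\circ G(Z_k)\big)\big]\Big|^2 .
\end{aligned} \tag{32}$$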



The expectation of the first term on the right-hand side of (32) is less than or equal
to

$$\frac{1}{\pi}\int_{-\pi m}^{\pi m}\frac{du}{|f_\eta^*(u)|^2}\,\frac{1}{n}\sum_{k=1}^n E\big(|w\circ\hat G_n(Z_k) - w\circ G(Z_k)|^2\big)
\le \frac{c_w^2}{\pi}\,E\big(\|\hat G_n - G\|_\infty^2\big)\int_{-\pi m}^{\pi m}\frac{du}{|f_\eta^*(u)|^2}
\le 2\pi\,C_1\,c_w^2\,\frac{\Delta_\eta(m)}{n},$$

by using $E(\|\hat G_n - G\|_\infty^{2k}) \le C_k/n^k$ (see e.g. Lemma 6.1, p. 462, in Brunel and Comte
(2005), which is a straightforward consequence of Massart (1990)). Here $C_k$ is a
numerical constant that depends on $k$ only. The expectation of the second term on the
right-hand side of (32) is a variance and less than or equal to

$$\frac{2\,E\big[(w\circ G(Z_1))^2\big]\,\Delta_\eta(m)}{n}.$$

Gathering the terms completes the proof of Proposition 3.1. □



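5.4 Proof of Theorem 3.1

We have the following decomposition of the contrast for functions $s, t$ in $S_m$,

$$\gamma^\dagger(t) - \gamma^\dagger(s) = \|t - f_Y\|^2 - \|s - f_Y\|^2 - 2\nu_n(t - s) - 2R_n(t - s), \tag{33}$$

where

$$\nu_n(t) = \frac{1}{2\pi n}\sum_{k=1}^n\int\frac{t^*(-u)}{f_\eta^*(u)}\Big[e^{-iuZ_k}(w\circ G)(Z_k) - E\big(e^{-iuZ_k}(w\circ G)(Z_k)\big)\Big]\,du \tag{34}$$

and

$$R_n(t) = \frac{1}{2\pi n}\sum_{k=1}^n\int\frac{t^*(-u)}{f_\eta^*(u)}\,e^{-iuZ_k}\,du\,\big[(w\circ\hat G_n)(Z_k) - (w\circ G)(Z_k)\big]. \tag{35}$$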

We start with decomposition (33). We take $t = \hat f_{\hat m}$ and $s = f_{Y,m}$. Since
$\gamma^\dagger(\hat f_{\hat m}) + \mathrm{pen}(\hat m) \le \gamma^\dagger(f_{Y,m}) + \mathrm{pen}(m)$, we get

$$\begin{aligned}
\frac{1}{4}E\big[\|f_Y - \hat f_{\hat m}\|^2\big]
&\le \frac{7}{4}\|f_Y - f_{Y,m}\|^2 + \mathrm{pen}(m) + 4E\Big(\sup_{t\in B_{m,\hat m}} [\nu_n(t)]^2\Big) - E(\mathrm{pen}(\hat m)) \\
&\quad + 8E\Big(\sup_{t\in B_{m,\hat m}} [R_n(t)]^2\Big),
\end{aligned} \tag{36}$$

where $\nu_n(t)$ and $R_n(t)$ are defined by (34) and (35), $B_m = \{t \in S_m,\ \|t\| = 1\}$ and
$B_{m,m'} = \{t \in S_m + S_{m'},\ \|t\| = 1\}$. Following a classical application of Talagrand's
inequality in the deconvolution context for ordinary smooth noise (Comte et al., 2006),
we deduce the following Lemma.

Lemma 5.1 Under the assumptions of Theorem 3.1,

$$E\Big(\sup_{t\in B_{m,\hat m}} [\nu_n(t)]^2 - p_1(m, \hat m)\Big)_+ \le \frac{C}{n},$$

where $p_1(m, m') = 2E\big((w\circ G)^2(Z_1)\big)\,\Delta_\eta(m\vee m')/n = 2\big(\int_0^1 w^2(u)\,du\big)\,\Delta_\eta(m\vee m')/n$.
Moreover, for the study of $R_n(t)$ we have the following Lemma.

Lemma 5.2 Under the assumptions of Theorem 3.1,

$$E\Big(\sup_{t\in B_{m,\hat m}} [R_n(t)]^2 - p_2(m, \hat m)\Big)_+ \le \frac{C}{n},$$

where $p_2(m, m') = c_w^2\,\Delta_\eta(m\vee m')\,\ln(n)/n$.

It follows from the definition of $p_i(m, m')$, $i = 1, 2$, that there exist numerical constants
$\kappa'$ and $\kappa''$, namely $\kappa', \kappa'' \ge 8$, such that $4p_1(m, m') + 8p_2(m, m') \le \mathrm{pen}(m) + \mathrm{pen}(m')$.
Now, starting from (36), we get, by applying Lemmas 5.1 and 5.2,

$$\begin{aligned}
\frac{1}{4}E\big[\|f_Y - \hat f_{\hat m}\|^2\big]
&\le \frac{7}{4}\|f_Y - f_{Y,m}\|^2 + \mathrm{pen}(m) + 4E\Big(\sup_{t\in B_{m,\hat m}} [\nu_n(t)]^2 - p_1(m, \hat m)\Big)_+ \\
&\quad + 8E\Big(\sup_{t\in B_{m,\hat m}} [R_n(t)]^2 - p_2(m, \hat m)\Big)_+ + E\big[4p_1(m, \hat m) + 8p_2(m, \hat m) - \mathrm{pen}(\hat m)\big] \\
&\le \frac{7}{4}\|f_Y - f_{Y,m}\|^2 + 2\,\mathrm{pen}(m) + \frac{c}{n}.
\end{aligned}$$

Therefore, if $\kappa \ge 16$, we get $(1/4)E[\|f_Y - \hat f_{\hat m}\|^2] \le (7/4)\|f_Y - f_{Y,m}\|^2 + 2\,\mathrm{pen}(m) + c/n$.
This completes the proof of Theorem 3.1. □

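Proof of Lemma 5.2. First we remark that, with the Cauchy-Schwarz inequality, we have

$$|R_n(t)|^2 \le \frac{1}{4\pi^2}\int |t^*(-u)|^2\,du
\int_{-\pi(m\vee\hat m)}^{\pi(m\vee\hat m)}\frac{du}{|f_\eta^*(u)|^2}\;
\frac{1}{n}\sum_{k=1}^n \big|(w\circ\hat G_n)(Z_k) - (w\circ G)(Z_k)\big|^2 .$$

Then Parseval's formula gives $\|t^*\|^2 = 2\pi\|t\|^2$ and we find

$$\sup_{t\in B_{m,\hat m}} |R_n(t)|^2
\le c_w^2\,\Delta_\eta(m\vee\hat m)\Big(\frac{1}{n}\sum_{k=1}^n |\hat G_n(Z_k) - G(Z_k)|^2\Big)
\le c_w^2\,\Delta_\eta(m\vee\hat m)\,\|\hat G_n - G\|_\infty^2 .$$

Now, we write $\sup_{t\in B_{m,\hat m}} |R_n(t)|^2 = \mathcal{R}_1 + \mathcal{R}_2$ by inserting again the indicator functions
$\mathbf{1}_{\Omega_G}$ and $\mathbf{1}_{\Omega_G^c}$, where $\Omega_G$ is defined by (30). Therefore

$$\begin{aligned}
E\Big(\sup_{t\in B_{m,\hat m}} [R_n(t)]^2 - p_2(m, \hat m)\Big)_+
&\le E\big(\mathcal{R}_1 - p_2(m, \hat m)\big)_+ + E(\mathcal{R}_2) \\
&\le c_w^2\,E\Big[\Delta_\eta(m\vee\hat m)\Big(\|\hat G_n - G\|_\infty^2\,\mathbf{1}_{\Omega_G} - \frac{\ln(n)}{n}\Big)_+\Big] \\
&\quad + c_w^2\,\Delta_\eta(m_n)\,E\big(\|\hat G_n - G\|_\infty^2\,\mathbf{1}_{\Omega_G^c}\big).
\end{aligned} \tag{37}$$

Next, $\big(\|\hat G_n - G\|_\infty^2\,\mathbf{1}_{\Omega_G} - \ln(n)/n\big)_+ = 0$ by definition of $\Omega_G$, which takes care of the
first right-hand-side term of (37). For the second term, $\Delta_\eta(m_n) \le n$ by the definition
of $m_n$, $\|\hat G_n - G\|_\infty \le 1$, and it follows from (31) that $P(\Omega_G^c) \le 2/n^2$. Therefore

$$E\Big(\sup_{t\in B_{m,\hat m}} [R_n(t)]^2 - p_2(m, \hat m)\Big)_+ \le c_w^2\,n\,P(\Omega_G^c) \le \frac{2c_w^2}{n}.$$

Gathering the bounds gives the result of Lemma 5.2. □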